The phonology of Standard Chinese is described below. Actual production varies widely among speakers, who often inadvertently introduce elements of their native dialects. Television and radio announcers, by contrast, are chosen for their accurate pronunciation and standard accent.
The following is the consonant inventory of Standard Chinese, transcribed in the International Phonetic Alphabet (IPA):
Manner | Bilabial | Labio-dental | Alveolar | Retroflex | Alveolo-palatal | Palatal | Velar |
---|---|---|---|---|---|---|---|
Nasal | m | | n | | | | ŋ |
Plosive | p pʰ | | t tʰ | | | | k kʰ |
Affricate | | | t͡s t͡sʰ | t͡ʂ t͡ʂʰ | t͡ɕ t͡ɕʰ ² | | |
Fricative | | f | s | ʂ | ɕ ² | | x |
Rhotic | | | | ɻ~ʐ ¹ | | | |
Approximant | | | l | | | (j) (ɥ) ³ | (w) ³ |
All but /ŋ/ occur in syllable onsets (as "initials"), whereas only /n/, /ŋ/, and /ɻ/ occur as syllable codas. In rapid speech, [m] may occur as an allophone of /n/ before [p], [pʰ], and [m].
The retroflex consonants are flat apical postalveolar (Ladefoged & Wu 1984; Ladefoged & Maddieson 1996:150-154). See retroflex consonants.
The alveolo-palatal consonants [t͡ɕ t͡ɕʰ ɕ] are in complementary distribution with the alveolar consonants [t͡s t͡sʰ s], the retroflex consonants [t͡ʂ t͡ʂʰ ʂ], and the velar consonants [k kʰ x], from which they derive historically. As a result, linguists often prefer to classify [t͡ɕ t͡ɕʰ ɕ] as allophones of one of the other three series. The Yale and Wade-Giles systems mostly treat the palatals as allophones of the retroflex consonants; Tongyong Pinyin mostly treats them as allophones of the alveolars; and Chinese braille treats them as allophones of the velars. Hanyu Pinyin, however, writes them as a separate series.
The collapse of the velar and alveolar sibilant series into the alveolo-palatal series in palatalizing environments is recent. Before it, some instances of modern [t͡ɕ(ʰ)i] were instead [k(ʰ)i], and others were [t͡s(ʰ)i]. The change took place in the last two or three centuries, at different times in different areas, but not in the dialect used in the imperial court of the Manchu (Qing) dynasty. This explains why some European transcriptions of Chinese names (especially in the postal map spelling) contain "ki-", "hi-", "tsi-" or "si-". Examples are "Peking" for Beijing; "Chungking" for Chongqing; "Fukien" for Fujian (a province); "Tientsin" for Tianjin; "Sinkiang" for Xinjiang; "Sian" for Xi'an. The complementary distribution with the retroflex series arose when syllables that had a retroflex consonant followed by a medial glide lost the glide.
[t͡ɕ t͡ɕʰ ɕ] may be pronounced [t͡sj t͡sʰj sj], a realization characteristic of the speech of young women and of some men. It is often regarded as rather effeminate and may also be considered substandard.
The null initial, written as an apostrophe in pinyin word-medially, is most commonly realized as [ɰ], though [n], [ŋ], [ɣ], and [ʔ] are common in nonstandard Mandarin dialects; some of these correspond to null in Standard Chinese but contrast with it in their dialect.
Standard Chinese has approximately half a dozen vowel phonemes, although a larger number of vowel phones can be distinguished phonetically.
At first glance, these would appear to constitute a system of eight phonemes: /a/ ([a ~ ä ~ ɑ]), /e/ ([e ~ ɛ ~ œ]), /o/ ([o ~ ɔ]), /ə/ ([ə ~ ɤ ~ ʌ]), /ɨ/ ([z̩ ~ ʐ̩]), /i/ ([i]), /u/ ([ʊ ~ u]), and /y/ ([y]). However, the mid vowels /e/, /o/, /ə/ are in complementary distribution, and are therefore treated as a single phoneme /ə/. Exceptions include exclamations that can be treated as outside of the core system (similar to the normal treatment of "hmm", "unh-unh", "shhh!" and other English exclamations that violate usual syllabic constraints): [ɛ] – [ɔ] (e.g. the interjections 喔, 哦 and 噢) – [ɰʌ] (e.g. 饿 "hungry", 鹅 "goose"), [jɛ] (e.g. 夜 "night", 爷 "grandfather") – [jɔ] (e.g. the interjection 哟), [lə] (e.g. 乐 "glad") – [lo] (e.g. the interjection 咯). Nonetheless, disregarding these exceptions would result in a six-vowel system.
It would also be possible to merge /ɨ/ and /i/, which are historically related, since they are also in complementary distribution, provided that the alveolo-palatal series is either left unmerged, or merged with the velars rather than the retroflex or alveolar series. (That is, [t͡ɕi], [t͡sɨ] and [t͡ʂɨ] all exist, but there is neither *[ki] nor *[kɨ], so there is no problem merging both [i]~[ɨ] and [k]~[t͡ɕ] at the same time.) The result is a five-vowel system of /a/, /ə/, /i/, /u/, and /y/.
The medials /j, w, ɥ/ can also be merged with the high vowels /i, u, y/: there is no ambiguity in interpreting a sequence like [jɑʊ̯] as /iau/, and potentially problematic sequences such as */iu/ never occur. This results in a minimal system with 19 consonants and 5 vowels.
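The arithmetic behind the "19 consonants and 5 vowels" figure can be checked with a short sketch. The grouping of phones into series and the names `CONSONANTS`, `VOWELS` and `minimal_inventory` are illustrative assumptions, not part of any standard analysis; the phones themselves are taken from the inventory described in this section.

```python
# Consonant phones from the inventory table, grouped by series.
# The grouping and the names used here are hypothetical, for illustration only.
CONSONANTS = {
    "nasal": ["m", "n", "ŋ"],
    "plosive": ["p", "pʰ", "t", "tʰ", "k", "kʰ"],
    "alveolar_sibilant": ["t͡s", "t͡sʰ", "s"],
    "retroflex": ["t͡ʂ", "t͡ʂʰ", "ʂ", "ɻ"],
    "alveolo_palatal": ["t͡ɕ", "t͡ɕʰ", "ɕ"],
    "other_fricative": ["f", "x"],
    "lateral": ["l"],
    "glide": ["j", "w", "ɥ"],
}

# The eight candidate vowel phonemes listed in the text.
VOWELS = ["a", "e", "o", "ə", "ɨ", "i", "u", "y"]


def minimal_inventory():
    """Apply the mergers described in the text and return (consonants, vowels)."""
    # The alveolo-palatals are treated as allophones of another series,
    # and the glides are merged with the high vowels, so both are dropped.
    consonants = {
        phone
        for series, phones in CONSONANTS.items()
        if series not in ("alveolo_palatal", "glide")
        for phone in phones
    }
    # The mid vowels collapse into /ə/, and /ɨ/ merges with /i/.
    mergers = {"e": "ə", "o": "ə", "ɨ": "i"}
    vowels = {mergers.get(v, v) for v in VOWELS}
    return consonants, vowels


consonants, vowels = minimal_inventory()
print(len(consonants), len(vowels))
```

Under these assumptions the sketch yields 19 consonants and the five vowels /a, ə, i, u, y/.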
An alternative and potentially more abstract system that sometimes appears in the linguistic literature (e.g. in Mantaro Hashimoto and Edwin Pulleyblank)[1] uses the opposite approach of analyzing the vowels /i/, /u/ and /y/ as the surface form of the glides /j, w, ɥ/ combined with a null meta-phoneme Ø. In this system, shown below, there are just two vowel nuclei, /a/ and /ə/; various allophones result from a preceding glide /j, w, ɥ/ (or null) and a coda /i~j, u~w, n, ŋ/ (or null; see erhua for the additional sequences afforded by the rhotic coda /ɻ/). (The minimal vowel /ɨ/ is ascribed to the surface manifestation of all three values being null, e.g. [sɨ] would be pronounced like an underlying syllable /s/.)
Nucleus | Coda | Medial Ø | Medial j | Medial w | Medial ɥ |
---|---|---|---|---|---|
a | Ø | ä | jä | wä | |
a | i | aɪ̯ | | waɪ̯ | |
a | u | äʊ̯ | jäʊ̯ | | |
a | n | an | jɛn | wan | ɥœ̜n |
a | ŋ | ɑŋ | jɑŋ | wɑŋ | |
ə | Ø | ɤ | jɛ | wɔ ¹ | ɥœ̜ ² |
ə | i | eɪ̯ | | weɪ̯ | |
ə | u | oʊ̯ | joʊ̯ | | |
ə | n | ən | in | wən | yn |
ə | ŋ | ɤŋ | iŋ | wɤŋ ~ ʊŋ ³ | jʊŋ |
Ø | | z̩~ʐ̩ | i | u | y |
¹ Both pinyin and zhuyin have an additional "o", used after "b p m f", which is distinguished from "uo", used after everything else. "o" is generally put into the first column instead of the third. However, in Beijing pronunciation, these are identical.
² Another way to represent the four finals of this line is: [ɰʌ jɛ wɔ ɥœ], which reflects Beijing pronunciation.
³ /wɤŋ/ is pronounced [ʊŋ] when it follows an initial.
The sequence [jɛn] can be considered to be phonemically either /jən/ or /jan/; likewise [ɥɛn] could be either /ɥən/ or /ɥan/. Since [jɛn] and [ɥɛn] become [jɐɻ] and [ɥɐɻ] with the addition of a suffix /ɻ/, the latter interpretation is generally preferred.
Syllables in Standard Chinese have the maximal form CGVCT, where the first C is the initial consonant; G is one of the glides /j w ɥ/; V is a vowel (or diphthong); the second C is a coda, /n ŋ ɻ/ (if diphthongs like ou, ai are analyzed as V) or /n ŋ ɻ j w/ (if not); and T is the tone. In traditional Chinese phonology, C is called the "initial", G the "medial", and VCT the "final" or "rime"; sometimes the medial is considered part of the rime.
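The CGVCT template can be illustrated with a minimal sketch that stores the five slots and enforces the constraints stated in this section. The class name `Syllable`, its field names and the `shape` method are hypothetical; the sketch assumes the analysis in which /j w/ may also serve as codas, and it covers only the four full tones.

```python
from dataclasses import dataclass
from typing import Optional

GLIDES = {"j", "w", "ɥ"}
# Assumption: diphthongs are analyzed as V plus a glide coda,
# so /j w/ are allowed as codas alongside /n ŋ ɻ/.
CODAS = {"n", "ŋ", "ɻ", "j", "w"}
TONES = {1, 2, 3, 4}  # the neutral tone is deliberately left out of this sketch


@dataclass(frozen=True)
class Syllable:
    nucleus: str
    tone: int
    initial: Optional[str] = None
    glide: Optional[str] = None
    coda: Optional[str] = None

    def __post_init__(self) -> None:
        # Every consonant except /ŋ/ may be an initial.
        if self.initial == "ŋ":
            raise ValueError("/ŋ/ does not occur as an initial")
        if self.glide is not None and self.glide not in GLIDES:
            raise ValueError(f"not a medial glide: {self.glide!r}")
        if self.coda is not None and self.coda not in CODAS:
            raise ValueError(f"not a permitted coda: {self.coda!r}")
        if self.tone not in TONES:
            raise ValueError(f"not a full tone: {self.tone!r}")

    def shape(self) -> str:
        """Return the filled slots of the CGVCT template, e.g. 'CVT'."""
        slots = (
            ("C", self.initial),
            ("G", self.glide),
            ("V", self.nucleus),
            ("C", self.coda),
            ("T", self.tone),
        )
        return "".join(label for label, value in slots if value is not None)


full = Syllable(initial="t͡ʂ", glide="w", nucleus="ɑ", coda="ŋ", tone=1)
bare = Syllable(nucleus="a", tone=4)
print(full.shape(), bare.shape())
```

A syllable that fills every slot reports the shape `CGVCT`, while a bare vowel with a tone reports `VT`.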
Not counting tone distinctions or the rhotic coda, there are some 35 finals in Standard Chinese.
Standard Chinese also uses a rhotic consonant, /ɻ/. This usage is a distinctive feature of Mandarin dialects, especially the Beijing dialect; other Chinese dialects lack the sound. In Chinese, the feature is known as erhua. There are two cases in which it is used:
The "r" final must be distinguished from the retroflex consonant written <ri> in pinyin and [ʐ] in IPA. "The star rode a donkey" in some rhotic English accents and 我女兒入醫院/我女儿入医院 Wǒ nǚ'ér rù yīyuàn "My daughter entered/enters the hospital" in Standard Chinese both have a first r pronounced with a relatively lax tongue and a second r that involves active retraction of the tongue and contact with the top of the mouth.
In other Mandarin dialects, the rhotic consonant is sometimes replaced by another syllable, such as li, in words that indicate locations. For example, 這兒/这儿 zhèr "here" and 那兒/那儿 nàr "there" become 這裡/这里 zhèli and 那裡/那里 nàli, respectively.
Standard Chinese, like all Chinese dialects, is a tonal language: tones, just like consonants and vowels, are used to distinguish words from each other. Many foreign learners have difficulty mastering the tone of each character, but correct tonal pronunciation is essential for intelligibility because of the vast number of words in the language that differ only by tone (i.e. are minimal pairs with respect to tone). Statistically, tones are as important as vowels in Standard Chinese.[2] The following are the four tones of Standard Chinese:
Tone name | Yin Ping | Yang Ping | Shang | Qu |
---|---|---|---|---|
Tone number | 1 | 2 | 3 | 4 |
Pinyin diacritic | ā | á | ǎ | à |
Tone letter | ˥˥ (55) | ˧˥ (35) | ˨˩, ˨˩˦ (21, 214) | ˥˩ (51) |
IPA diacritic | á | ǎ | à, a᷉ | â |
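The "Pinyin diacritic" row of the table can be reproduced with Unicode combining marks. This is a minimal sketch: the names `TONE_MARKS` and `mark_vowel` are hypothetical, and the function only decorates a single vowel letter; it does not decide which vowel of a multi-vowel syllable carries the mark.

```python
import unicodedata

# Combining marks matching the pinyin diacritics in the table:
# macron (tone 1), acute (tone 2), caron (tone 3), grave (tone 4).
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}


def mark_vowel(vowel: str, tone: int) -> str:
    """Attach the pinyin diacritic for `tone` to a single vowel letter."""
    if tone not in TONE_MARKS:
        raise ValueError(f"no pinyin diacritic for tone {tone!r}")
    # NFC composes the letter and the mark into one precomposed character
    # where Unicode defines one.
    return unicodedata.normalize("NFC", vowel + TONE_MARKS[tone])


print(" ".join(mark_vowel("a", tone) for tone in (1, 2, 3, 4)))
```

Calling `mark_vowel("a", tone)` for tones 1 to 4 gives the four forms shown in the table row.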
Also called fifth tone or zeroth tone (in Chinese: 輕聲/轻声 qīng shēng, literal meaning: "light tone"), neutral tone is sometimes thought of as a lack of tone. It usually comes at the end of a word or phrase, and is pronounced in a light and short manner. Because of this characteristic, and because there is no standard rule for whether a syllable has a neutral tone, it is considered analogous to an unstressed syllable. The neutral tone has a large number of allophones: its pitch depends almost entirely on the tone of the preceding syllable. The situation is further complicated by the amount of dialectal variation associated with it; in some regions, notably Taiwan, the neutral tone is relatively uncommon.
Despite many examples of minimal pairs (for example, 要是 yàoshì "if" and 钥匙 yàoshi "key"), it is sometimes described as something other than a full-fledged tone for technical reasons: some linguists feel that it results from a "spreading out" of the tone on the preceding syllable. This idea is intuitively appealing because, without it, the neutral tone needs relatively complex tone sandhi rules to be made sense of; indeed, it would have to have four allotones, one for each of the four tones that could precede it. However, the "spreading" theory incompletely characterizes the neutral tone, especially in sequences where more than one neutral-tone syllable occurs in a row.[5]
The following neutral-tone pitches are from the Beijing dialect.[6] Other dialects may differ slightly.
Tone of first syllable | Pitch of neutral tone | Example | Pinyin | English meaning |
---|---|---|---|---|
1 ˥ | ˨ (2) | 玻璃 (˥.˨) | bōli | glass |
2 ˧˥ | ˧ (3) | 伯伯 (˧˥.˧) | bóbo | uncle |
3 ˨˩ | ˦ (4) | 喇叭 (˨˩.˦) | lǎba | horn |
4 ˥˩ | ˩ (1) | 兔子 (˥˩.˩) | tùzi | rabbit |
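Because the pitch of the neutral tone depends on the preceding tone, the Beijing values in the table above amount to a simple lookup. The names `NEUTRAL_PITCH` and `neutral_tone_pitch` below are hypothetical, and the values apply only to the Beijing data given here.

```python
# Pitch of a neutral-tone syllable (on the 1-5 scale) keyed by the tone
# of the preceding syllable, as listed in the table for Beijing dialect.
NEUTRAL_PITCH = {1: 2, 2: 3, 3: 4, 4: 1}


def neutral_tone_pitch(preceding_tone: int) -> int:
    """Return the pitch level of a neutral tone after `preceding_tone`."""
    try:
        return NEUTRAL_PITCH[preceding_tone]
    except KeyError:
        raise ValueError(f"not a full tone: {preceding_tone!r}") from None


# After a third tone (as in lǎba) the neutral tone is relatively high.
print(neutral_tone_pitch(3))
```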
Most romanizations represent the tones as diacritics on the vowels (e.g., Hanyu Pinyin, MPS II and Tongyong Pinyin). Zhuyin uses diacritics as well. Others, like Wade-Giles, use superscript numbers at the end of each syllable. The tone marks and numbers are rarely used outside of language textbooks. Gwoyeu Romatzyh is a rare example in which tones are represented not by special symbols but by ordinary letters of the alphabet (although without a one-to-one correspondence).
To listen to the tones, see http://www.wku.edu/~shizhen.gao/Chinese101/pinyin/tones.htm (click on the blue-red yin yang symbol).
Pronunciation also varies with context according to the rules of tone sandhi. The most prominent phenomenon of this kind is when there are two third tones in immediate sequence, in which case the first of them changes to a rising tone, the second tone. In the literature, this contour is often called two-thirds tone or half-third tone, though generally, in Standard Chinese, the "two-thirds tone" is the same as the second tone. If there are three third tones in series, the tone sandhi rules become more complex, and depend on word boundaries, stress, and dialectal variations.
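The two-syllable case of third-tone sandhi can be sketched as a function on tone numbers. The name `third_tone_sandhi` is hypothetical, and the sketch deliberately handles only a pair of syllables, since longer sequences depend on word boundaries, stress and dialect as noted above.

```python
from typing import Tuple


def third_tone_sandhi(first: int, second: int) -> Tuple[int, int]:
    """Return the surface tones of a two-syllable sequence.

    Only the rule described in the text is applied: a third tone
    immediately followed by another third tone becomes a second tone.
    """
    if first == 3 and second == 3:
        return 2, 3
    return first, second


print(third_tone_sandhi(3, 3))  # the first third tone surfaces as a second tone
print(third_tone_sandhi(3, 4))  # no change
```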
"一" (yī) and "不" (bù) have special rules which do not apply to other Chinese characters:
Relationship between Middle Chinese and modern tones:
V- = unvoiced initial consonant
L = sonorant initial consonant
V+ = voiced initial consonant (not sonorant)
Middle Chinese tone | Initial | Standard Chinese tone | Tone contour |
---|---|---|---|
Ping (平) | V- | Yin Ping (陰平, 1) | 55 |
Ping (平) | L | Yang Ping (陽平, 2) | 35 |
Ping (平) | V+ | Yang Ping (陽平, 2) | 35 |
Shang (上) | V- | Shang (上, 3) | 214 |
Shang (上) | L | Shang (上, 3) | 214 |
Shang (上) | V+ | Qu (去, 4) | 51 |
Qu (去) | V-, L, V+ | Qu (去, 4) | 51 |
Ru (入) | V- | redistributed with no pattern | |
Ru (入) | L | to Qu | to 51 |
Ru (入) | V+ | to Yang Ping | to 35 |
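The correspondences in the table above can be written as a small function. This is a sketch under the reading of the table in which Ping splits by voicing, voiced-obstruent Shang joins Qu, and Ru syllables with voiceless initials have no predictable outcome; the function name `modern_tone` and the string labels are hypothetical.

```python
from typing import Optional

INITIAL_CLASSES = {"V-", "L", "V+"}  # unvoiced, sonorant, voiced obstruent


def modern_tone(mc_tone: str, initial: str) -> Optional[int]:
    """Map a Middle Chinese tone and initial class to a modern tone number.

    Returns None where the table gives no regular outcome
    (Ru tone with an unvoiced initial).
    """
    if initial not in INITIAL_CLASSES:
        raise ValueError(f"unknown initial class: {initial!r}")
    if mc_tone == "ping":
        return 1 if initial == "V-" else 2
    if mc_tone == "shang":
        return 4 if initial == "V+" else 3
    if mc_tone == "qu":
        return 4
    if mc_tone == "ru":
        return {"V-": None, "L": 4, "V+": 2}[initial]
    raise ValueError(f"unknown Middle Chinese tone: {mc_tone!r}")


print(modern_tone("ping", "L"), modern_tone("shang", "V+"), modern_tone("ru", "V-"))
```

Returning `None` for the unpredictable case keeps it distinct from the four regular tone numbers.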
When the two morphemes of a compound word cannot be ordered by grammar, their order is usually determined by tone: Yin Ping (1), Yang Ping (2), Shang (3), Qu (4), and Ru, the plosive-ending tone that has since disappeared. Below are some compound words that show this rule. Tones are shown in parentheses, and R indicates Ru.
The stress system of Chinese distinguishes three degrees of stress. Three stress patterns commonly occur in two-syllable compound words:[7]